December 8, 2024
This project provides an in-depth analysis of labor force statistics
sourced from the Eurostat lfsa_egan dataset. The dataset
includes a wealth of information on labor market participation, broken
down by demographic factors such as sex, age,
citizenship, and geographical regions across European
countries. The goal is to explore how these factors influence employment
trends over time and how they vary across different countries and
regions.
In addition to the statistical analysis, the project also includes a geospatial analysis of labor market trends for the year 2023. This analysis uses interactive maps to visualize employment patterns across different European countries and regions.
The analysis focuses on several key aspects, including the evolution of Employment Trends, with particular attention to the differences between citizenship groups, age categories, and genders. Additionally, the project explores employment disparities by gender, as well as the relationship between employment levels and both gender and citizenship categories.
Using R’s interactive visualizations, the project further compares employment patterns before and after the EU enlargements of 2004 and 2007 (using snapshots for 2000 and 2007), and after two major crises, the 2008 financial crisis and the COVID-19 pandemic (using snapshots for 2010 and 2023). The findings aim to provide insights into the changing landscape of the European labor market, highlighting the factors that contribute to employment dynamics and the ongoing challenges in achieving gender equality in labor force participation.
Loading the necessary libraries for data manipulation, analysis, and visualization.
# Load necessary libraries
library(tidyverse)
library(eurostat)
library(FactoMineR)
library(factoextra)
library(ggplot2)
library(dplyr)
library(forcats)
library(lubridate)
library(plotly)
library(scales)
library(patchwork)
library(gridExtra)
library(highcharter)
library(knitr)
library(kableExtra)
library(htmltools)
library(htmlwidgets)
The lfsa_egan dataset was obtained from EUROSTAT’s
official website. This dataset belongs to the Labour Force
Survey (LFS) and contains annual employment data categorized by
different demographics and economic dimensions.
- Dataset: 496671 observations and 8 variables.
- Types of variables: it mainly contains categorical variables and one numerical variable.
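As a quick sanity check on this structure, the sketch below verifies that a data frame has the eight expected columns and exactly one numeric column. The toy data frame `toy` and the helper `check_lfsa_structure()` are illustrative assumptions and not part of the original analysis; in practice the check would be run on the object returned by the eurostat package.

```r
# Expected column names, taken from the variable table in this report
expected_cols <- c("freq", "unit", "sex", "age", "citizen",
                   "geo", "TIME_PERIOD", "values")

# Hypothetical helper: reports missing columns and counts numeric columns
check_lfsa_structure <- function(df) {
  missing_cols <- setdiff(expected_cols, names(df))
  n_numeric <- sum(vapply(df, is.numeric, logical(1)))
  list(
    ok = length(missing_cols) == 0 && n_numeric == 1,
    missing = missing_cols,
    n_numeric = n_numeric
  )
}

# Toy stand-in for the real dataset (values are made up)
toy <- data.frame(
  freq = "A", unit = "THS_PER", sex = c("F", "M"), age = "Y15-64",
  citizen = "TOTAL", geo = "AT",
  TIME_PERIOD = as.Date("2023-01-01"), values = c(2000.5, 2300.1)
)

res <- check_lfsa_structure(toy)
res$ok
```

Dates are not treated as numeric by `is.numeric()`, so `values` is the only column counted as numerical here.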
In order to have a better understanding of the dataset, a brief description for each variable and the corresponding unique values are provided.
| Variable | Description | Type | Unique_values |
|---|---|---|---|
| TIME_PERIOD | Sampling year | date | 1995-01-01, 1996-01-01, 1997-01-01, 1998-01-01, 1999-01-01, 2000-01-01, 2001-01-01, 2002-01-01, 2003-01-01, 2004-01-01, 2005-01-01, 2006-01-01, 2007-01-01, 2008-01-01, 2009-01-01, 2010-01-01, 2011-01-01, 2012-01-01, 2013-01-01, 2014-01-01, 2015-01-01, 2016-01-01, 2017-01-01, 2018-01-01, 2019-01-01, 2020-01-01, 2021-01-01, 2022-01-01, 2023-01-01 |
| geo | Geopolitical entity | chr | AT, BE, CH, CY, CZ, DE, DK, EA20, EE, EL, ES, EU27_2020, FI, FR, HU, IE, IS, IT, LU, ME, MT, NL, NO, PT, RS, SE, SI, SK, UK, BG, HR, LT, LV, MK, PL, RO, BA, TR |
| citizen | Country of citizenship | chr | EU27_2020_FOR, FOR, NAT, NEU27_2020_FOR, NRP, STLS, TOTAL |
| age | Age ranges | chr | Y15-19, Y15-24, Y15-39, Y15-59, Y15-64, Y15-74, Y20-24, Y20-64, Y25-29, Y25-49, Y25-54, Y25-59, Y25-64, Y25-74, Y30-34, Y35-39, Y40-44, Y40-59, Y40-64, Y45-49, Y50-54, Y50-59, Y50-64, Y50-74, Y55-59, Y55-64, Y60-64, Y65-69, Y65-74, Y70-74, Y_GE15, Y_GE25, Y_GE50, Y_GE65, Y_GE75 |
| unit | Measurement unit for employment | chr | THS_PER (Thousand persons) |
| sex | Gender | chr | F (Female), M (Male), T (Total) |
| freq | Sampling frequency | chr | A (Annual) |
| values | Values for each combination of variables | num | numerical values |
And a description of each Geopolitical Entity code is provided:
| Code | Country Name |
|---|---|
| AT | Austria |
| BE | Belgium |
| CH | Switzerland |
| CY | Cyprus |
| CZ | Czech Republic |
| DE | Germany |
| DK | Denmark |
| EE | Estonia |
| EL | Greece |
| ES | Spain |
| FI | Finland |
| FR | France |
| HU | Hungary |
| IE | Ireland |
| IS | Iceland |
| IT | Italy |
| LU | Luxembourg |
| ME | Montenegro |
| MT | Malta |
| NL | Netherlands |
| NO | Norway |
| PT | Portugal |
| RS | Serbia |
| SE | Sweden |
| SI | Slovenia |
| SK | Slovakia |
| UK | United Kingdom |
| BG | Bulgaria |
| HR | Croatia |
| LT | Lithuania |
| LV | Latvia |
| MK | North Macedonia |
| PL | Poland |
| RO | Romania |
| BA | Bosnia and Herzegovina |
| TR | Turkey |
| EU27_2020 | European Union member states as of 2020, encompassing all 27 EU countries. |
| EA20 | Euro Area (20 countries). It refers to EU member states that have adopted the euro as their currency by 2023 |
The dataset contains 1 continuous and 7 categorical variables. Out of these, two variables present only one category, so there is no need to explore them. Some interactive visualizations are displayed, showing the distribution of Missing Values and the following categorical variables: sex, age, geo, citizen, and TIME_PERIOD. Users can zoom in on the plot to better visualize the values when they are few.
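The two checks described above (missing values per variable and variables with a single category) can be sketched in base R as follows. The toy data frame is made up for illustration and only mimics a few of the dataset's columns; it is not the author's code.

```r
# Toy stand-in for the dataset (values are made up)
toy <- data.frame(
  freq = "A",
  unit = "THS_PER",
  sex = c("F", "M", "T", "F"),
  geo = c("AT", "AT", "BE", "BE"),
  values = c(10.2, NA, 30.1, NA),
  stringsAsFactors = FALSE
)

# Number of missing values per column
na_counts <- colSums(is.na(toy))

# Number of distinct values per column (NA counts as one value)
n_unique <- vapply(toy, function(x) length(unique(x)), integer(1))

# Columns with a single category carry no information and can be skipped
constant_cols <- names(n_unique)[n_unique == 1]

na_counts
constant_cols
```

In this toy example `freq` and `unit` are flagged as constant, which matches the two single-category variables listed in the table above.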
To gain a deeper understanding of the changes in employment levels across various citizenship groups from 1995 to 2023, the following analysis was conducted. This analysis aims to identify key trends, anomalies, and each country’s contribution over the specified time period.
The dataset was filtered into two subsets: one grouping all
European Countries and the other grouping the European
Union Member States as of 2020 (EU27_2020). Both
subsets display values for the Working Age Population
(Y15-64) and include aggregated data for both Females and
Males.
library(dplyr)
library(lubridate)
library(highcharter)
plot_employment_trends <- function(data,
age_filter = "Y15-64",
sex_filter = "T",
citizen_filter = unique(data$citizen),
tickInterval = 50000000,
include_geo = c("EU27_2020", "CH", "IS", "ME", "NO", "RS", "UK", "MK", "BA", "TR"),
plot_title = "Employment Trends by Citizenship Group (All European Countries)") {
  # Filter data for the specified age, sex, citizenship groups, and countries
  working_age_data <- data %>%
    filter(
      age == age_filter,
      sex == sex_filter,
      citizen %in% citizen_filter, # previously accepted as an argument but never applied
      geo %in% include_geo
    ) %>%
    mutate(
      year = year(TIME_PERIOD),
      employment_real = values * 1000 # Convert thousands of persons to persons
    )
# Group employment by year and citizenship
time_trend <- working_age_data %>%
group_by(TIME_PERIOD, citizen) %>%
summarise(total_employment = sum(employment_real, na.rm = TRUE)) %>%
ungroup()
# Create the Highcharter plot
hchart <- highchart() %>%
hc_chart(type = "line") %>%
hc_title(text = plot_title) %>%
hc_subtitle(text = paste("Working Age:", age_filter, "| Females and Males together")) %>%
hc_xAxis(
type = "datetime",
title = list(text = "Year"),
labels = list(format = "{value:%Y}")
) %>%
hc_yAxis(
title = list(text = "Total Employment"),
labels = list(format = "{value:,.0f}"), # Formats large numbers with commas
tickInterval = tickInterval
) %>%
hc_tooltip(
shared = TRUE,
crosshairs = TRUE,
pointFormat = "<b>{series.name}</b> - <b>{point.y:,.0f}</b><br>"
) %>%
hc_colors(c(
"#003399", # EU27_2020_FOR (Dark Blue)
"#DAA520", # FOR (Goldenrod)
"#228B22", # NAT (Forest Green)
"#00CED1", # NEU27_2020_FOR (Dark Turquoise)
"#1E90FF", # NRP (Dodger Blue)
"#9370DB", # STLS (Medium Purple)
"#FF1493" # TOTAL (Deep Pink)
)) %>%
hc_plotOptions(
series = list(marker = list(enabled = FALSE))
)
# Add series for each citizenship group
citizenship_groups <- unique(time_trend$citizen)
for (citizen in citizenship_groups) {
citizen_data <- time_trend %>% filter(citizen == !!citizen)
hchart <- hchart %>%
hc_add_series(
data = list_parse2(data.frame(
x = as.numeric(as.POSIXct(citizen_data$TIME_PERIOD)) * 1000,
y = citizen_data$total_employment
)),
name = citizen
)
}
# Define event annotations (vertical dashed lines)
event_annotations <- data.frame(
date = as.Date(c("2001-01-01", "2004-01-01", "2007-01-01", "2008-01-01", "2020-01-01"))
)
plot_lines <- lapply(1:nrow(event_annotations), function(i) {
list(
color = "#084594", # Dark blue for the line
width = 2, # Line thickness
value = as.numeric(as.POSIXct(event_annotations$date[i])) * 1000,
dashStyle = "Dash", # Dashed line style
zIndex = 3, # Ensure lines are above the plot
label = list(
text = "", # No static text
style = list(fontSize = "0px") # Hide the label
)
)
})
# Add the plot lines to the Highcharter plot
hchart <- hchart %>%
hc_xAxis(plotLines = plot_lines) %>%
hc_exporting(enabled = TRUE) %>%
hc_legend(enabled = TRUE)
return(hchart)
}
plot_employment_trends(data)
Citizen Group TOTAL: This category represents the
sum of FOR, NAT, STLS, and
NRP. It does not include EU27_2020_FOR and
NEU27_2020_FOR.
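The relationship between TOTAL and its four components can be checked with a few lines of base R. This is a minimal sketch on made-up numbers for a single country and year; with the real data the same comparison would be repeated for each combination of geo, year, age, and sex.

```r
# Toy values for one country-year, in thousands (made up for illustration)
toy <- data.frame(
  citizen = c("FOR", "NAT", "STLS", "NRP", "TOTAL",
              "EU27_2020_FOR", "NEU27_2020_FOR"),
  values  = c(300, 4000, 1, 20, 4321, 120, 180)
)

# Components that make up TOTAL according to the text above
components <- c("FOR", "NAT", "STLS", "NRP")

component_sum  <- sum(toy$values[toy$citizen %in% components])
reported_total <- toy$values[toy$citizen == "TOTAL"]

# TRUE when the four components add up to the reported TOTAL;
# EU27_2020_FOR and NEU27_2020_FOR are deliberately left out of the sum
totals_match <- isTRUE(all.equal(component_sum, reported_total))
totals_match
```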
The dashed blue lines represent key events that may have influenced employment trends:
2001: The combination of preparations for Euro adoption, economic growth, labor reforms, and anticipation of EU enlargement contributed to the significant rise in employment levels. This period marked optimism and structural changes that positively impacted employment across various sectors in the EU.
2004: EU Enlargement led to a notable increase in employment levels, particularly for native citizens and total employment.
2007: Bulgaria and Romania’s accession to the EU continued to drive an upward trend in employment levels.
2008: The Financial Crisis caused a sharp decline in employment trends due to widespread business failures, reduced investments, and economic contraction, resulting in significant job losses across many sectors in the EU.
2020: The COVID-19 Pandemic led to a slight dip in employment, reflecting the economic disruptions during this period.
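The events above can be kept in a small table and converted into the millisecond timestamps that a Highcharts datetime axis expects. The sketch below is a variant of the chart code in which each line also carries a label; the label texts and the `events` object are assumptions made for illustration.

```r
# Event dates from the list above, with short illustrative labels
events <- data.frame(
  date  = as.Date(c("2001-01-01", "2004-01-01", "2007-01-01",
                    "2008-01-01", "2020-01-01")),
  label = c("Euro adoption preparations", "EU enlargement",
            "Bulgaria and Romania join the EU",
            "Financial crisis", "COVID-19 pandemic"),
  stringsAsFactors = FALSE
)

# Highcharts datetime axes use milliseconds since 1970-01-01 UTC
events$ms <- as.numeric(as.POSIXct(events$date)) * 1000

# One plot-line definition per event (a plain list, ready for hc_xAxis(plotLines = ...))
plot_lines <- lapply(seq_len(nrow(events)), function(i) {
  list(
    value = events$ms[i],
    dashStyle = "Dash",
    label = list(text = events$label[i])
  )
})

length(plot_lines)
```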
plot_employment_trends(data, include_geo = c("AT", "BE", "BG", "HR", "CY", "CZ", "DK", "EE", "FI", "FR", "DE", "EL", "HU", "IE", "IT", "LV", "LT", "LU", "MT", "NL", "PL", "PT", "RO", "SK", "SI", "ES", "SE"), age_filter = "Y15-64", citizen_filter = unique(data$citizen), plot_title = "Employment Trends by Citizenship Group (EU27_2020)")
Employment in the European Union Member States experienced an overall
steady growth during the sampling period. National employment
(NAT) appears to be the most positively affected category
by EU enlargement, with a notable shift around this event. Additionally,
the “No Responses” category (NRP) dropped significantly
around the same time period.
Employment levels were influenced by both the 2008 financial crisis and the COVID-19 pandemic. Total employment decreased from 190 million in 2008 to nearly 184 million in 2010, and again from almost 195 million in 2019 to 193 million in 2021.
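To put the two declines on a common scale, the rounded figures quoted above can be turned into percentage changes. This is a back-of-the-envelope sketch that uses only the approximate values in the text (in millions), so the results are indicative rather than exact.

```r
# Percentage change from one level to another
pct_change <- function(from, to) (to - from) / from * 100

# Rounded figures quoted in the text, in millions of employed persons
crisis_2008 <- pct_change(from = 190, to = 184)  # 2008 -> 2010
covid_2020  <- pct_change(from = 195, to = 193)  # 2019 -> 2021

round(c(financial_crisis = crisis_2008, pandemic = covid_2020), 1)
```

With these rounded inputs the decline after the financial crisis is roughly three times the size of the decline around the pandemic (about 3.2% versus about 1.0%).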
The following plot shows the employment trend of the working-age population for all countries in the dataset that are not European Union Member States. These countries are: Switzerland, Iceland, Montenegro, Norway, Serbia, the United Kingdom, North Macedonia, Bosnia and Herzegovina, and Turkey.
plot_employment_trends(data, include_geo = c("CH", "IS", "ME", "NO", "RS", "UK", "MK", "BA", "TR"), age_filter = "Y15-64", citizen_filter = unique(data$citizen), plot_title = "Employment Trends by Citizenship Group (non EU27)")
The distribution of the data here is not always homogeneous,
highlighting possible anomalies within the data collection process. A
significant increase in total employment levels is noticeable after
2004, primarily driven by the variable No Response
(NRP), which contributed substantially to this rise. The
2008 financial crisis does not appear to have had a major impact on
total employment, while the COVID-19 Pandemic caused a
substantial decrease in total employment levels, marking the biggest
drop since 2005. This decline mainly affected Nationals
(NAT).
To further investigate employment levels, the contribution of each EU
Member State was explored with the following stacked plot. Only the
citizen categories FOR, NAT, STLS, and NRP were kept (their sum
reproduces TOTAL), and the geo aggregates EU27_2020 and EA20 were
excluded to avoid double counting.
The working age and sex filters remained unchanged.
library(dplyr)
library(tidyr)
library(highcharter)
library(purrr)
library(lubridate)
eu27_countries <- c("AT","BE","BG","CY","CZ","DE","DK","EE","EL",
"ES","FI","FR","HR","HU","IE","IT","LT","LU",
"LV","MT","NL","PL","PT","RO","SE","SI","SK")
# Filter dataset for desired age, sex, and EU27 countries
temporary_data <- data %>%
filter(
age == "Y15-64",
sex == "T",
geo %in% eu27_countries
) %>%
mutate(
year = year(TIME_PERIOD),
employment_real = values * 1000
)
# Keep only FOR, NAT, STLS, NRP and aggregate them into MY_TOTAL
eu_data <- temporary_data %>%
filter(citizen %in% c("FOR", "NRP", "STLS", "NAT")) %>%
group_by(year, geo) %>%
summarise(MY_TOTAL = sum(employment_real, na.rm = TRUE), .groups = "drop") %>%
# Exclude EU27_2020 and EA20
filter(!geo %in% c("EU27_2020", "EA20"))
# Pivot data so each geo is a column, one row per year
eu_data_wide <- eu_data %>%
pivot_wider(
names_from = geo,
values_from = MY_TOTAL,
values_fill = 0
)
# Calculate total contribution per country across all years
country_totals <- colSums(eu_data_wide[,-1]) # exclude the 'year' column
# Sort countries by total contribution (descending)
sorted_countries <- names(sort(country_totals, decreasing = TRUE))
# Reorder columns according to sorted countries
eu_data_wide <- eu_data_wide[, c("year", sorted_countries)]
# Prepare the series list as [x,y] pairs for each country
series_list <- lapply(sorted_countries, function(country) {
data_pairs <- Map(function(x_val, y_val) list(x = x_val, y = y_val), eu_data_wide$year, eu_data_wide[[country]])
list(
name = country,
data = data_pairs
)
})
# Create the stacked area chart
hchart <- highchart() %>%
hc_chart(type = "area") %>%
hc_title(text = "EU27_2020 Countries Total Contribution Over Time") %>%
hc_xAxis(
title = list(text = "Year"),
labels = list(format = "{value}")
) %>%
hc_yAxis(
title = list(text = "Total Employment"),
labels = list(format = "{value:,.0f}")
) %>%
hc_plotOptions(
area = list(stacking = "normal", marker = list(enabled = FALSE))
) %>%
hc_tooltip(
shared = TRUE,
pointFormat = "<b>{series.name}:</b> {point.y:,.0f}<br>",
headerFormat = "<b>Year: {point.x}</b><br/>"
) %>%
hc_add_series_list(series_list) %>%
hc_legend(enabled = TRUE) %>%
hc_exporting(enabled = TRUE)
hchart
Germany appears to consistently have the highest employment levels, followed, for most of the sampled years, by France, Italy, Spain, and Poland. Until 2001, some countries had missing values, highlighting potential anomalies, likely due to data collection issues. However, after 2001, data became available for every year without any gaps.
Until now, employment trends have been analyzed based on the
Working Age population (Y15-64). The
lfsa_egan dataset contains 35 different age groups.
However, the following analysis focuses on three key age groups:
- Youth (Y15-24)
- Prime Working Age (Y25-54)
- Older Working Age (Y55-64)
plot_employment_trends(data, include_geo = c("AT", "BE", "BG", "HR", "CY", "CZ", "DK", "EE", "FI", "FR", "DE", "EL", "HU", "IE", "IT", "LV", "LT", "LU", "MT", "NL", "PL", "PT", "RO", "SK", "SI", "ES", "SE"), age_filter = "Y15-24", citizen_filter = unique(data$citizen), tickInterval = 5000000, plot_title = "Employment Trends for Youths (EU27_2020)")
plot_employment_trends(data, include_geo = c("AT", "BE", "BG", "HR", "CY", "CZ", "DK", "EE", "FI", "FR", "DE", "EL", "HU", "IE", "IT", "LV", "LT", "LU", "MT", "NL", "PL", "PT", "RO", "SK", "SI", "ES", "SE"), age_filter = "Y25-54", citizen_filter = unique(data$citizen), tickInterval = 20000000, plot_title = "Employment Trends for Prime Working Age (EU27_2020)")
plot_employment_trends(data, include_geo = c("AT", "BE", "BG", "HR", "CY", "CZ", "DK", "EE", "FI", "FR", "DE", "EL", "HU", "IE", "IT", "LV", "LT", "LU", "MT", "NL", "PL", "PT", "RO", "SK", "SI", "ES", "SE"), age_filter = "Y55-64", citizen_filter = unique(data$citizen), tickInterval = 10000000, plot_title = "Employment Trends for Older Working Age (EU27_2020)")
As of 2023, Youth represent 8.3% of total employment in the EU Member States, while individuals in the Prime Working Age represent 71.9%, and those in the Older Working Age account for 19.8%. The latter group does not appear to have been negatively affected by the two crises, while both the Youth and Prime Working Age groups showed a decrease in employment after the 2008 Financial Crisis and the COVID-19 Pandemic.
From 2008 to 2010, employment in Youth decreased by 13.2% (resulting in 2.4 million fewer employed people), and in the Prime Working Age group, it decreased by 3.5% (resulting in 5.3 million fewer employed people). From 2019 to 2020, employment in Youth decreased by 5.8% (resulting in approximately 900,000 fewer employed people), and in the Prime Working Age group, it decreased by 2.1% (resulting in 3 million fewer employed people).
The effects of the financial crisis seem to have lasted longer, reaching their lowest levels around 2013-2014, compared to those of the pandemic, where employment began to recover after only two years.
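The shares and percentage changes discussed in this section follow from two simple calculations: each age group's share of the yearly total, and the year-over-year change within each group. The sketch below shows both on a toy table; the numbers are made up and do not reproduce the figures quoted above.

```r
# Toy employment by year and age group, in thousands (made-up values)
toy <- data.frame(
  year   = c(2022, 2022, 2022, 2023, 2023, 2023),
  age    = rep(c("Y15-24", "Y25-54", "Y55-64"), times = 2),
  values = c(100, 700, 200, 90, 720, 190)
)

# Share of each age group in the total of its year
toy$share_pct <- toy$values / ave(toy$values, toy$year, FUN = sum) * 100

# Year-over-year percentage change within each age group
toy <- toy[order(toy$age, toy$year), ]
toy$pct_change <- ave(
  toy$values, toy$age,
  FUN = function(v) c(NA, diff(v) / head(v, -1) * 100)
)

toy
```

The first year of each group has no previous value to compare against, so its `pct_change` is `NA`.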
Another dataset from EUROSTAT was used for this analysis:
the dataset lfsi_emp_a. This dataset contains similar data
to the previous one, but with additional information, specifically the
percentage of the working population (PC_POP). This added
variable provides a deeper insight into the composition of the workforce
across different countries and years.
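As an illustration of what a percentage-of-population figure expresses, the sketch below computes an employment rate as employed persons divided by the population of the same age group, then bins it into classes. The country codes, the numbers, and the column names are hypothetical; the breaks are chosen to mirror the 55/65/75/82 thresholds used for the map in this section.

```r
# Toy data: hypothetical country codes and made-up counts in thousands
toy <- data.frame(
  geo            = c("AA", "BB", "CC"),
  employed_ths   = c(4200, 1500, 900),
  population_ths = c(5800, 2500, 1000)
)

# Employment rate as a percentage of the population in the same age group
toy$pc_pop <- toy$employed_ths / toy$population_ths * 100

# Bin the rate into left-closed classes: [0,55), [55,65), [65,75), [75,82), [82,100]
toy$class <- cut(
  toy$pc_pop,
  breaks = c(0, 55, 65, 75, 82, 100),
  labels = c("<55", "55-65", "65-75", "75-82", ">82"),
  right = FALSE,
  include.lowest = TRUE
)

toy
```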
The percentage of the working population (Y15-64) for
2023 is displayed in the following plot.
countries <- c(
"AT", "BE", "BG", "HR", "CY", "CZ", "DK", "EE", "FI", "FR", "DE", "EL", "HU", "IE", "IT", "LV", "LT", "LU", "MT", "NL", "PL", "PT", "RO", "SK", "SI", "ES", "SE", "CH", "IS", "ME", "NO", "RS", "UK", "MK", "BA", "TR")
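df <- data2 %>%
  filter(age == "Y15-64",
         geo %in% countries,
         indic_em == "EMP_LFS",
         sex == "T",
         unit == "PC_POP") %>%
  mutate(
    year = year(TIME_PERIOD),
    # Eurostat uses EL and UK, while the map's "iso-a2" key uses the ISO codes GR and GB
    geo = case_when(geo == "EL" ~ "GR", geo == "UK" ~ "GB", TRUE ~ geo)
  ) %>%
  select(geo, year, values)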
# Define color classes (example, adjust ranges accordingly)
dta_clss <- list(
list(from = 0, to = 55, color = "#EBEBEB", name = "<55"),
list(from = 55, to = 65, color = "#BFD0E4", name = "55-65"),
list(from = 65, to = 75, color = "#7FA1C9", name = "65-75"),
list(from = 75, to = 82, color = "#4073AF", name = "75-82"),
list(from = 82, to = 100, color = "#003776", name = ">82")
)
# Function to generate the map for a specific year
generate_map <- function(year_selected) {
# Filter the data for the selected year
df_year <- df %>%
filter(year == year_selected)
# Generate the map
hc <- hcmap("custom/europe",
data = df_year,
joinBy = c("iso-a2", "geo"),
name = "Employment Rate",
value = "values",
tooltip = list(pointFormat = "{point.value}%"),
dataLabels = list(
enabled = TRUE,
format = "{point.value}%" # Display the value followed by "%"
)) %>%
hc_colorAxis(dataClassColor = "category",
dataClasses = dta_clss) %>%
hc_title(text = paste("Employment Rate, ", year_selected)) %>% # Dynamically update title
hc_subtitle(text = "(% of people aged 15-64)")
return(hc)
}
# Generate the map for 2023 (any other available year can be passed instead)
generate_map(2023)
To explore how employment levels are distributed across genders, some interactive plots are presented.
The following plot shows the employment levels by Gender for
individuals aged 15-64 (Y15-64) across all European countries in the
dataset. The 27 EU Member States enter through the EU27_2020 aggregate
and the EA20 aggregate is left out, so no country is counted twice. No
differentiation was made regarding the Citizen Groups
(FOR, NAT, STLS, and
NRP were considered together). A dedicated function was
built to generate these specific plots.
library(dplyr)
library(lubridate)
library(highcharter)
plot_employment_levels_by_gender <- function(
data,
age_filter = "Y15-64",
citizen_filter = c("FOR", "NAT", "STLS", "NRP"),
sex_filter = c("M", "F"),
geo_filter = c("EU27_2020", "CH", "IS", "ME", "NO", "RS", "UK", "MK", "BA", "TR"),
chart_title = "Employment Levels by Gender for the Working Age Population (15-64) Over Time",
chart_subtitle = "All countries (excluding EU-major aggregates)",
male_color = "#003399",
female_color = "#FFCC00"
) {
# Filter and preprocess the data
filtered_data <- data %>%
filter(
age == age_filter,
citizen %in% citizen_filter,
sex %in% sex_filter,
geo %in% geo_filter
) %>%
mutate(
year = year(TIME_PERIOD),
employment_real = values * 1000 # Convert employment values to actual numbers
)
# Summarize employment by year and sex
summarized_data <- filtered_data %>%
group_by(year, sex) %>%
summarise(total_employment = sum(employment_real, na.rm = TRUE)) %>%
ungroup()
# Create the Highcharter bar plot
hchart <- highchart() %>%
hc_chart(type = "column") %>%
hc_title(text = chart_title) %>%
hc_subtitle(text = chart_subtitle) %>%
hc_xAxis(
categories = unique(summarized_data$year),
title = list(text = "Year"),
labels = list(rotation = 45) # Rotate x-axis labels for readability
) %>%
hc_yAxis(
title = list(text = "Total Employment"),
labels = list(format = "{value:,}") # Format y-axis with commas
) %>%
hc_plotOptions(column = list(grouping = TRUE)) %>%
hc_tooltip(shared = TRUE, pointFormat = "<b>{series.name}</b>: {point.y:,.0f}<br>") %>%
hc_add_series(
data = summarized_data %>% filter(sex == "M") %>% pull(total_employment),
name = "Male",
color = male_color
) %>%
hc_add_series(
data = summarized_data %>% filter(sex == "F") %>% pull(total_employment),
name = "Female",
color = female_color
) %>%
hc_legend(enabled = TRUE) %>%
hc_exporting(enabled = TRUE)
return(hchart)
}
plot_employment_levels_by_gender(data)
Considering the absolute values, the number of employed males was always higher than the number of employed females during the considered time frame. The COVID-19 Pandemic seems to have had a greater impact on employment levels compared to the 2008 Financial Crisis.
The same type of plot is shown two more times: first for EU Member States only, then for non-EU Member States, keeping all other variables unchanged in both cases.
plot_employment_levels_by_gender(
data,
age_filter = "Y15-64",
citizen_filter = c("FOR", "NAT", "STLS", "NRP"),
sex_filter = c("M", "F"),
geo_filter = c("AT", "BE", "BG", "HR", "CY", "CZ", "DK", "EE", "FI", "FR", "DE", "EL", "HU", "IE", "IT", "LV", "LT", "LU", "MT", "NL", "PL", "PT", "RO", "SK", "SI", "ES", "SE"
),
chart_title = "Employment Levels by Gender for the Working Age Population (15-64) Over Time",
chart_subtitle = "All EU Member States",
male_color = "#003399",
female_color = "#FFCC00"
)
plot_employment_levels_by_gender(
data,
age_filter = "Y15-64",
citizen_filter = c("FOR", "NAT", "STLS", "NRP"),
sex_filter = c("M", "F"),
geo_filter = c("CH", "IS", "ME", "NO", "RS", "UK", "MK", "BA", "TR"),
chart_title = "Employment Levels by Gender for the Working Age Population (15-64) Over Time",
chart_subtitle = "non-EU Member States",
male_color = "#003399",
female_color = "#FFCC00"
)
Female employment levels consistently appear lower than male values, both within EU Member States and non-EU Member States. As of 2023, in relative terms, the disparity between employed males and females is larger in the non-EU Member States. EU enlargement policies seem to have significantly increased employment levels. However, for non-EU Member States this increase primarily affected males. The 2008 Financial Crisis appears to have had a more significant negative impact on EU Member States, while the COVID-19 Pandemic seems to have affected non-EU Member States more.
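One way to express the disparity "in relative terms" is the gap between male and female employment as a share of male employment. The sketch below uses made-up totals for two groups of countries; both the numbers and the choice of male employment as the denominator are assumptions for illustration, not the measure used by the author.

```r
# Toy employment totals by country group and sex (made-up values)
toy <- data.frame(
  group      = c("EU", "EU", "non-EU", "non-EU"),
  sex        = c("M", "F", "M", "F"),
  employment = c(100, 90, 50, 30)
)

# Cross-tabulate: one row per group, one column per sex
m <- xtabs(employment ~ group + sex, data = toy)

# Relative gap: how much lower female employment is, as a percentage of male employment
gap_pct <- (m[, "M"] - m[, "F"]) / m[, "M"] * 100
gap_pct
```

A larger value means a wider gap; in this toy example the gap is 10% for the first group and 40% for the second.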
The following interactive visualization allows users to select
different years to observe how employment levels have changed over time
across countries (including the aggregates EA20 and
EU27_2020) for the working population (15-64), providing
insights into gender-specific employment trends and patterns.
library(dplyr)
library(tidyr)
library(highcharter)
library(htmlwidgets)
# Filter the data for working age (Y15-64), Males (M), Females (F), and citizen TOTAL
filtered_data <- data %>%
filter(
age == "Y15-64",
sex %in% c("M", "F"),
citizen == "TOTAL"
) %>%
mutate(
year = year(TIME_PERIOD),
employment_real = values * 1000 # Convert employment values to actual numbers
)
# Summarize employment by year, country, and sex
summarized_data <- filtered_data %>%
group_by(year, geo, sex) %>%
summarise(total_employment = sum(employment_real, na.rm = TRUE)) %>%
ungroup()
# Ensure all combinations of year, geo, and sex exist, filling missing ones with zero
summarized_data <- summarized_data %>%
complete(year, geo, sex, fill = list(total_employment = 0))
# Define the correct order of countries with EA20 and EU27_2020 first
geo_levels <- c("EA20", "EU27_2020", setdiff(unique(summarized_data$geo), c("EA20", "EU27_2020")))
# Apply the factor order to geo
summarized_data$geo <- factor(summarized_data$geo, levels = geo_levels)
# Create a highcharter plot with one Male/Female series pair per year (years are switched on and off via the legend)
years <- unique(summarized_data$year)
# Initialize highchart
hchart <- highchart() %>%
hc_chart(type = "column") %>%
hc_title(text = "Employment Levels by Gender and Country (Working Age: 15-64)") %>%
hc_subtitle(text = "Interactive Year Selection | Including EA20 and EU27_2020") %>%
hc_xAxis(
categories = levels(summarized_data$geo),
title = list(text = "Country"),
labels = list(
rotation = 45,
style = list(fontSize = "8px")
)
) %>%
hc_yAxis(
title = list(text = "Total Employment"),
labels = list(format = "{value:,}")
) %>%
hc_plotOptions(column = list(grouping = TRUE)) %>%
hc_tooltip(shared = TRUE, pointFormat = "<b>{series.name}</b>: {point.y:,.0f}<br>")
# Add series for each year
for (yr in years) {
year_data <- summarized_data %>% filter(year == yr)
hchart <- hchart %>%
hc_add_series(
data = year_data %>% filter(sex == "M") %>% arrange(geo) %>% pull(total_employment),
name = paste0("Male - ", yr),
color = "#003399",
visible = ifelse(yr == min(years), TRUE, FALSE)
) %>%
hc_add_series(
data = year_data %>% filter(sex == "F") %>% arrange(geo) %>% pull(total_employment),
name = paste0("Female - ", yr),
color = "#FFCC00",
visible = ifelse(yr == min(years), TRUE, FALSE)
)
}
# Add exporting options, the legend used for year selection, and a "Toggle Legend" button
hchart <- hchart %>%
hc_exporting(enabled = TRUE) %>%
hc_legend(enabled = TRUE) %>%
hc_chart(events = list(
load = JS("
function() {
var chart = this;
var legendVisible = true;
var button = chart.renderer.button('Toggle Legend', 10, 10)
.on('click', function() {
legendVisible = !legendVisible;
chart.update({
legend: { enabled: legendVisible },
xAxis: {
labels: {
style: {
fontSize: '8px'
},
rotation: 45
}
}
});
// Adjust the chart margins dynamically
if (!legendVisible) {
chart.update({
chart: { marginRight: 50 }
});
} else {
chart.update({
chart: { marginRight: 150 }
});
}
})
.add();
}")
))
# Display the chart
hchart
A comparative analysis was conducted to explore how employment levels vary between different countries for specific citizenship categories.
For this analysis, a subset was created from the initial dataset,
including only the Working Population (Y15-64)
with no differentiation by gender (T). Specific
years were selected for analysis:
- 2000 and 2007, before and after the EU enlargements
- 2010 and 2023, after the 2008 Financial Crisis and the COVID-19 Pandemic
Heatmaps were generated for each of these years, and two comparisons were made.
Heatmaps for 2000 and 2007 were generated and compared to capture the
differences in employment levels across countries before and
after the EU enlargements. All individual countries in the dataset
are included, while the EU-level aggregates (EU27_2020 and
EA20) are excluded. Aggregate Citizen groups (TOTAL,
EU27_2020_FOR, and NEU27_2020_FOR)
were removed to avoid double counting.
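Besides comparing two heatmaps by eye, the change per cell can be computed directly by building one citizenship-by-country matrix per year and subtracting them. This is a minimal sketch on a toy table with hypothetical country codes and made-up values; it is an optional complement and not part of the original analysis.

```r
# Toy employment by country, citizenship, and year (made-up values, thousands)
toy <- data.frame(
  geo     = rep(c("AA", "BB"), each = 4),
  citizen = rep(c("NAT", "FOR"), times = 4),
  year    = rep(rep(c(2000, 2007), each = 2), times = 2),
  values  = c(1000, 50, 1100, 90, 400, 10, 380, 25)
)

# One citizenship x country matrix per year
m2000 <- xtabs(values ~ citizen + geo, data = toy[toy$year == 2000, ])
m2007 <- xtabs(values ~ citizen + geo, data = toy[toy$year == 2007, ])

# Cell-by-cell change between the two snapshots
change <- m2007 - m2000
change
```

Positive cells indicate growth between the two snapshots and negative cells a decline, which is the same information the color shift between the two heatmaps conveys.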
library(dplyr)
library(plotly)
# Filter data for selected years and only 'Total' for sex and working-age population (15-64)
selected_years <- c("2000-01-01", "2007-01-01")
filtered_data <- data %>%
filter(
TIME_PERIOD %in% as.Date(selected_years),
sex == "T",
age == "Y15-64",
!citizen %in% c("EU27_2020_FOR", "NEU27_2020_FOR", "TOTAL"),
!geo %in% c("EU27_2020", "EA20") # Exclude "EU27_2020" and "EA20"
)
# Group employment by country, citizenship, and year
country_citizen_summary <- filtered_data %>%
group_by(geo, citizen, TIME_PERIOD) %>%
summarise(total_employment = sum(values, na.rm = TRUE))
# Function to create a plotly heatmap for a specific year
create_plotly_heatmap <- function(year, showlegend = TRUE) {
df <- country_citizen_summary %>% filter(TIME_PERIOD == as.Date(year))
# Multiply total_employment by 1,000 to reflect real values
df <- df %>% mutate(total_employment_real = total_employment * 1000)
plot_ly(
data = df,
x = ~geo,
y = ~citizen,
z = ~total_employment_real,
type = "heatmap",
colors = colorRamp(c("#f0f9ff", "#003399")),
colorbar = list(
title = "<b>Employment</b>",
tickfont = list(size = 12),
titlefont = list(size = 14, family = "Arial")
),
showscale = showlegend, # Control legend display
zmin = 0, zmax = 50000000, # Set consistent color scale limits
# Custom hover text
hovertemplate = paste(
"<b>Country:</b> %{x}<br>",
"<b>Citizenship:</b> %{y}<br>",
"<b>Total Employment:</b> %{z}<br>",
"<extra></extra>" # Removes default trace info
)
) %>%
layout(
title = list(
#text = paste("<b>Employment by Country and Citizenship -", format(as.Date(year), "%Y"), "</b>"),
font = list(size = 18, family = "Arial"),
x = 0.5, # Center the title
xanchor = "center"
),
xaxis = list(
title = "<b>Country</b>",
tickangle = 45,
tickfont = list(size = 10),
titlefont = list(size = 14, family = "Arial")
),
yaxis = list(
title = "<b>Citizenship Category</b>",
tickfont = list(size = 10),
titlefont = list(size = 14, family = "Arial")
),
margin = list(t = 60, b = 60) # Add padding to top and bottom margins
)
}
# Create plotly heatmaps for each selected year
interactive_heatmap_2000 <- create_plotly_heatmap("2000-01-01", showlegend = TRUE)
interactive_heatmap_2007 <- create_plotly_heatmap("2007-01-01", showlegend = FALSE)
# Combine the interactive heatmaps vertically
combined_interactive_heatmaps <- subplot(
interactive_heatmap_2000,
interactive_heatmap_2007,
nrows = 2,
shareX = TRUE,
shareY = TRUE,
titleX = TRUE,
titleY = TRUE
) %>%
layout(
annotations = list(
list(
text = "<b>Year: 2000 (before EU enlargements)</b>",
x = 0.5,
y = 1.06,
xref = "paper",
yref = "paper",
showarrow = FALSE,
font = list(size = 11, family = "Arial")
),
list(
text = "<b>Year: 2007 (after EU enlargements)</b>",
x = 0.5,
y = 0.50,
xref = "paper",
yref = "paper",
showarrow = FALSE,
font = list(size = 11, family = "Arial")
)
),
title = "<b>Employment Trends by Citizenship Category and Country (2000 vs 2007)</b>",
margin = list(l = 100, r = 50, t = 80, b = 100),
titlefont = list(size = 20, family = "Arial", color = "black")
)
# Display the combined interactive heatmaps
combined_interactive_heatmaps
The heatmaps for the years 2000 and 2007 show notable variations in employment levels across different EU countries and citizenship categories. The color intensity represents employment levels, with darker blue indicating higher employment.
- National employment (NAT) dominates in most countries, reflecting a labor market primarily consisting of native citizens.
- … Germany (DE), France (FR), and the UK.
- … Germany (DE), the UK, and Spain (ES) suggests increased labor migration and workforce integration of foreign-born citizens.
Heatmaps for 2010 and 2023 were compared to capture the differences in employment levels across countries after two important crises: the 2008 Financial Crisis and the COVID-19 Pandemic.
library(dplyr)
library(plotly)
# Filter data for selected years and only 'Total' for sex and working-age population (15-64)
selected_years <- c("2010-01-01", "2023-01-01")
filtered_data <- data %>%
filter(
TIME_PERIOD %in% as.Date(selected_years),
sex == "T",
age == "Y15-64",
!citizen %in% c("EU27_2020_FOR", "NEU27_2020_FOR", "TOTAL"),
!geo %in% c("EU27_2020", "EA20") # Exclude "EU27_2020" and "EA20"
)
# Group employment by country, citizenship, and year
country_citizen_summary <- filtered_data %>%
group_by(geo, citizen, TIME_PERIOD) %>%
summarise(total_employment = sum(values, na.rm = TRUE))
# Function to create a plotly heatmap for a specific year
create_plotly_heatmap <- function(year, showlegend = TRUE) {
df <- country_citizen_summary %>% filter(TIME_PERIOD == as.Date(year))
# Multiply total_employment by 1,000 to reflect real values
df <- df %>% mutate(total_employment_real = total_employment * 1000)
plot_ly(
data = df,
x = ~geo,
y = ~citizen,
z = ~total_employment_real,
type = "heatmap",
colors = colorRamp(c("#f0f9ff", "#003399")),
colorbar = list(
title = "<b>Employment</b>",
tickfont = list(size = 12),
titlefont = list(size = 14, family = "Arial")
),
showscale = showlegend, # Control legend display
zmin = 0, zmax = 50000000, # Set consistent color scale limits
# Custom hover text
hovertemplate = paste(
"<b>Country:</b> %{x}<br>",
"<b>Citizenship:</b> %{y}<br>",
"<b>Total Employment:</b> %{z}<br>",
"<extra></extra>" # Removes default trace info
)
) %>%
layout(
title = list(
#text = paste("<b>Employment by Country and Citizenship -", format(as.Date(year), "%Y"), "</b>"),
font = list(size = 18, family = "Arial"),
x = 0.5, # Center the title
xanchor = "center"
),
xaxis = list(
title = "<b>Country</b>",
tickangle = 45,
tickfont = list(size = 10),
titlefont = list(size = 14, family = "Arial")
),
yaxis = list(
title = "<b>Citizenship Category</b>",
tickfont = list(size = 10),
titlefont = list(size = 14, family = "Arial")
),
margin = list(t = 60, b = 60) # Add padding to top and bottom margins
)
}
# Create plotly heatmaps for each selected year
interactive_heatmap_2010 <- create_plotly_heatmap("2010-01-01", showlegend = TRUE)
interactive_heatmap_2023 <- create_plotly_heatmap("2023-01-01", showlegend = FALSE)
# Combine the interactive heatmaps vertically
combined_interactive_heatmaps <- subplot(
interactive_heatmap_2010,
interactive_heatmap_2023,
nrows = 2,
shareX = TRUE,
shareY = TRUE,
titleX = TRUE,
titleY = TRUE
) %>%
layout(
annotations = list(
list(
text = "<b>Year: 2010 (after 2008 Financial Crisis)</b>",
x = 0.5,
y = 1.06,
xref = "paper",
yref = "paper",
showarrow = FALSE,
font = list(size = 11, family = "Arial")
),
list(
text = "<b>Year: 2023 (after COVID-19 Pandemic)</b>",
x = 0.5,
y = 0.50,
xref = "paper",
yref = "paper",
showarrow = FALSE,
font = list(size = 11, family = "Arial")
)
),
title = "<b>Employment Trends by Citizenship Category and Country (2010 vs 2023)</b>",
margin = list(l = 100, r = 50, t = 80, b = 100),
titlefont = list(size = 20, family = "Arial", color = "black")
)
# Display the combined interactive heatmaps
combined_interactive_heatmaps
The heatmaps for the years 2010 and 2023 show variations in employment levels across EU countries and citizenship categories. The color intensity represents employment levels, with darker blue indicating higher employment.
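Differences in color between two stacked heatmaps can be hard to judge by eye, so the comparison can also be expressed as a percentage change per country and citizenship category. The sketch below shows one way to do this, assuming a summary table shaped like country_citizen_summary. The object toy_summary, the country codes "AA" and "BB", and all values are invented for illustration and are not taken from the Eurostat data.

```r
library(dplyr)
library(tidyr)

# Toy stand-in for country_citizen_summary (hypothetical values, not Eurostat data)
toy_summary <- data.frame(
  geo = c("AA", "AA", "BB", "BB"),
  citizen = "NAT",
  TIME_PERIOD = as.Date(c("2010-01-01", "2023-01-01", "2010-01-01", "2023-01-01")),
  total_employment = c(1000, 1100, 500, 450)
)

# One row per country and citizenship category, one column per year,
# followed by the percentage change from 2010 to 2023
employment_change <- toy_summary %>%
  mutate(year = format(TIME_PERIOD, "%Y")) %>%
  select(-TIME_PERIOD) %>%
  pivot_wider(
    names_from = year,
    values_from = total_employment,
    names_prefix = "emp_"
  ) %>%
  mutate(pct_change = 100 * (emp_2023 - emp_2010) / emp_2010)

print(employment_change)
```

On the real data, country and category combinations with a missing value in either year would yield NA for pct_change, which makes gaps in coverage explicit rather than showing them as pale cells.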
[…] (NAT) remains prominent, but declines (compared to the previous year) are observed in some countries due to the impact of the 2008 financial crisis.
[…] Germany (DE), the United Kingdom (UK), and France (FR), which were more resilient to the crisis.
[…] Spain (ES) and Italy (IT) show noticeable declines (compared to the previous year), consistent with the severe economic challenges they faced during this period.
[…] (NAT) and foreign-born category (FOR), indicating the labor market rebound after the COVID-19 pandemic.
[…] Germany (DE), France (FR), and Spain (ES), suggesting a return to pre-pandemic trends and increased workforce integration.
[…] the United Kingdom (UK) and for North Macedonia (MK), due to the UK's departure from the European Union in 2020 and because North Macedonia implemented a regulation that introduced significant changes to the Labor Force Survey (LFS).
This analysis of labor force data from Eurostat highlights key trends in employment across Europe, focusing on sex, age, geopolitical entity, and citizenship. Employment trends show fluctuations driven by events like EU enlargements and economic crises, particularly impacting foreign nationals. Gender disparities persist, though employment levels for both males and females have grown over time.
Geographical differences are evident, with countries like Germany and France contributing significantly to total employment. The analysis of citizenship groups reveals a notable increase in employment among non-EU nationals, especially post-2010. Overall, the findings underscore the importance of addressing employment inequalities and adapting to global shifts in the labor market.
The following sources were consulted and referenced throughout this analysis:
Eurostat. (2024). Labor Force Survey (lfsa_egan). Available at: https://ec.europa.eu/eurostat/databrowser/view/LFSA_EGAN/default/table?lang=en
Eurostat. (2024). Employment and activity - LFS adjusted series (lfsi_emp_a). Available at: https://ec.europa.eu/eurostat/databrowser/view/lfsi_emp_a/default/table?lang=en
European Commission. (2020). The European Union. Available at: https://european-union.europa.eu/easy-read_en
Jonathan Regenstein (2022). Highcharts for R users. Available at: https://www.highcharts.com/blog/tutorials/highcharts-for-r-users/